16 research outputs found
Core Decomposition in Multilayer Networks: Theory, Algorithms, and Applications
Multilayer networks are a powerful paradigm to model complex systems, where multiple relations occur between the same entities. Despite the keen interest in a variety of tasks, algorithms, and analyses in this type of network, the problem of extracting dense subgraphs has remained largely unexplored so far.
As a first step in this direction, in this work we study the problem of core decomposition of a multilayer network. Unlike the single-layer counterpart, in which cores are all nested into one another and can be computed in linear time, the multilayer context is much more challenging, as no total order exists among multilayer cores; rather, they form a lattice whose size is exponential in the number of layers. In this setting, we devise three algorithms, which differ in the way they visit the core lattice and in their pruning techniques. We assess the time and space efficiency of the three algorithms on a large variety of real-world multilayer networks.
We then move a step forward and study the problem of extracting the inner-most (also known as maximal) cores, i.e., the cores that are not dominated by any other core in terms of their core index in all the layers. Inner-most cores are typically orders of magnitude fewer than all the cores. Motivated by this, we devise an algorithm that effectively exploits the maximality property and extracts inner-most cores directly, without first computing a complete decomposition. This allows for a consistent speed-up over a naïve method that simply filters out non-inner-most cores from the complete set.
Finally, we showcase the multilayer core-decomposition tool in a variety of scenarios and problems. We start by considering the problem of densest-subgraph extraction in multilayer networks. We introduce a definition of multilayer densest subgraph that trades off between high density and the number of layers in which the high density holds, and exploit multilayer core decomposition to approximate this problem with quality guarantees. As further applications, we show how to utilize multilayer core decomposition to speed up the extraction of frequent cross-graph quasi-cliques and to generalize the community-search problem to the multilayer setting.
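As a rough illustration of the multilayer core concept described above (not the paper's actual algorithms), the following sketch peels a multilayer graph down to the core for a given vector of per-layer degree thresholds: a vertex survives only if, within the surviving set, it has at least k[l] neighbors in every layer l.

```python
def multilayer_core(layers, k):
    """Peel the graph down to the (k[0], ..., k[L-1])-core: the maximal
    vertex set where every vertex has at least k[l] neighbors (within
    the set) in each layer l. `layers` is a list of adjacency dicts."""
    # start from all vertices appearing in any layer
    alive = set()
    for adj in layers:
        alive.update(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if v not in alive:
                continue
            for adj, kl in zip(layers, k):
                deg = sum(1 for u in adj.get(v, ()) if u in alive)
                if deg < kl:
                    alive.discard(v)
                    changed = True
                    break
    return alive
```

For instance, with a triangle in one layer and a path in the other, the (2, 1)-core is the whole triangle, while the (2, 2)-core is empty. Enumerating all such cores over the exponential lattice of threshold vectors is exactly what makes the multilayer setting hard.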
Explainable Classification of Brain Networks via Contrast Subgraphs
Mining human-brain networks to discover patterns that can be used to
discriminate between healthy individuals and patients affected by some
neurological disorder is a fundamental task in neuroscience. Learning simple
and interpretable models is as important as mere classification accuracy. In
this paper we introduce a novel approach for classifying brain networks based
on extracting contrast subgraphs, i.e., a set of vertices whose induced
subgraphs are dense in one class of graphs and sparse in the other. We formally
define the problem and present an algorithmic solution for extracting contrast
subgraphs. We then apply our method to a brain-network dataset consisting of
children affected by Autism Spectrum Disorder and children Typically Developed.
Our analysis confirms the interestingness of the discovered patterns, which
match background knowledge in the neuroscience literature. Further analysis on
other classification tasks confirms the simplicity, soundness, and high
explainability of our proposal, which also exhibits classification accuracy
superior to more complex state-of-the-art methods.
Comment: To be published at KDD 202
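The paper's exact formulation and algorithm are not reproduced here; as a loose illustration of the contrast-subgraph idea (a vertex set whose induced subgraph is dense in one class of graphs and sparse in the other), a naive greedy sketch over per-edge class-frequency differences might look like:

```python
def contrast_subgraph(graphs_a, graphs_b, n, size):
    """Greedy sketch (illustrative only): pick `size` vertices maximizing
    the number of induced edges in class-A graphs minus class-B graphs.
    graphs_a / graphs_b: lists of sets of sorted edge tuples over 0..n-1."""
    # edge weight = (#A-graphs containing the edge) - (#B-graphs containing it)
    weight = {}
    for g in graphs_a:
        for e in g:
            weight[e] = weight.get(e, 0) + 1
    for g in graphs_b:
        for e in g:
            weight[e] = weight.get(e, 0) - 1

    chosen = set()
    for _ in range(size):
        best, best_gain = None, float('-inf')
        for v in range(n):
            if v in chosen:
                continue
            # marginal contrast gained by adding v to the current set
            gain = sum(weight.get(tuple(sorted((v, u))), 0) for u in chosen)
            if gain > best_gain:
                best, best_gain = v, gain
        chosen.add(best)
    return chosen
```

On brain networks, the two graph classes would be the condition and control groups, with one graph per subject over a shared set of brain regions.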
Analyzing Declarative Deployment Code with Large Language Models
In the cloud-native era, developers have at their disposal an unprecedented landscape of services to build scalable distributed systems. The DevOps paradigm emerged as a response to the increasing necessity of better automation, capable of dealing with the complexity of modern cloud systems. For instance, Infrastructure-as-Code tools provide a declarative way to define, track, and automate changes to the infrastructure underlying a cloud application. Assuring the quality of this part of a code base is of utmost importance. However, learning to produce robust deployment specifications is not an easy feat, and for the domain experts it is time-consuming to conduct code reviews and transfer the appropriate knowledge to novice members of the team. Given the abundance of data generated throughout the DevOps cycle, machine learning (ML) techniques seem a promising way to tackle this problem. In this work, we propose an approach based on Large Language Models to analyze declarative deployment code and automatically provide QA-related recommendations to developers, such that they can benefit from established best practices and design patterns. We developed a prototype of our proposed ML pipeline, and empirically evaluated our approach on a collection of Kubernetes manifests exported from a repository of internal projects at Nokia Bell Labs.
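The paper's actual prompts and model are not described here; as a hypothetical sketch of such a pipeline, one might assemble a review prompt from a manifest and hand it to any LLM client (the `llm` callable and the listed best practices below are illustrative assumptions, not the authors' implementation):

```python
def build_review_prompt(manifest_yaml):
    """Assemble a QA-review prompt for a declarative deployment manifest.
    The best-practice checklist here is an illustrative assumption."""
    return (
        "You are a DevOps reviewer. Analyze the following Kubernetes "
        "manifest and list violations of common best practices "
        "(e.g., missing resource limits, 'latest' image tags, "
        "missing liveness/readiness probes):\n\n" + manifest_yaml
    )

def analyze_manifest(manifest_yaml, llm):
    """`llm` is any callable mapping a prompt string to a completion
    string, e.g. a wrapper around a hosted or local language model."""
    return llm(build_review_prompt(manifest_yaml))
```

Separating prompt construction from the model call keeps the pipeline testable: the recommendation logic can be exercised with a stub model before wiring in a real one.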
Behavioral Analysis for Virtualized Network Functions : A SOM-based Approach
In this paper, we tackle the problem of detecting anomalous behaviors in a virtualized infrastructure for network function virtualization, proposing to use self-organizing maps for analyzing historical data available through a data center. We propose a joint analysis of system-level metrics, mostly related to resource consumption patterns of the hosted virtual machines, as available through the virtualized infrastructure monitoring system, and the application-level metrics published by individual virtualized network functions through their own monitoring subsystems. Experimental results, obtained by processing real data from one of the NFV data centers of the Vodafone network operator, show that our technique is able to identify specific points in space and time of the recent evolution of the monitored infrastructure that are worth investigating by a human operator in order to keep the system running under expected conditions.
SOM-based behavioral analysis for virtualized network functions
In this paper, we propose a mechanism based on Self-Organizing Maps for analyzing the resource consumption behaviors and detecting possible anomalies in data centers for Network Function Virtualization (NFV). Our approach is based on a joint analysis of two historical data sets available through two separate monitoring systems: system-level metrics for the physical and virtual machines obtained from the monitoring infrastructure, and application-level metrics available from the individual virtualized network functions. Experimental results, obtained by processing real data from one of the NFV data centers of the Vodafone network operator, highlight the capability of our system to identify interesting points in space and time in the evolution of the monitored infrastructure.
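The two abstracts above do not spell out the SOM details; a minimal sketch of the underlying idea — train a SOM on normal resource-consumption samples, then flag samples whose quantization error (distance to the best-matching unit) is large — could look like this (grid size, decay schedules, and thresholds below are illustrative assumptions):

```python
import numpy as np

def train_som(data, rows=5, cols=5, iters=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Minimal online SOM (sketch): for each random sample, pull the
    best-matching unit and its Gaussian neighborhood toward the sample,
    with decaying learning rate and neighborhood width."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    w = rng.normal(size=(rows, cols, dim))
    gy, gx = np.mgrid[0:rows, 0:cols]          # grid coordinates
    for t in range(iters):
        x = data[rng.integers(len(data))]
        d = np.linalg.norm(w - x, axis=2)
        by, bx = np.unravel_index(np.argmin(d), d.shape)   # best unit
        lr = lr0 * (1 - t / iters)
        sigma = sigma0 * (1 - t / iters) + 0.1
        h = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2) / (2 * sigma ** 2))
        w += lr * h[:, :, None] * (x - w)
    return w

def quantization_error(w, x):
    """Distance from sample x to its best-matching unit: large values
    flag behaviors unlike anything seen during training."""
    return np.linalg.norm(w - x, axis=2).min()
```

After training on historical metric vectors, samples with quantization error above a chosen threshold are surfaced to the operator as anomaly candidates.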
Forecasting Operation Metrics for Virtualized Network Functions
Network Function Virtualization (NFV) is the key technology that allows modern network operators to provide flexible and efficient services, by leveraging general-purpose private cloud infrastructures. In this work, we investigate the performance of a number of metric forecasting techniques based on machine learning and artificial intelligence, and provide insights on how they can support the decisions of NFV operation teams. Our analysis focuses on both infrastructure-level and service-level metrics. The former can be fetched directly from the monitoring system of an NFV infrastructure, whereas the latter are typically provided by the monitoring components of the individual virtualized network functions. Our selected forecasting techniques are experimentally evaluated using real-life data, exported from a production environment deployed within some Vodafone NFV data centers. The results show what the compared techniques can achieve in terms of forecasting accuracy and the computational cost required to train them on production data.
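The specific forecasting techniques compared in the paper are not detailed here; as a baseline illustration of metric forecasting, a least-squares autoregressive model over a univariate metric series can be sketched as follows (lag count and model form are illustrative choices):

```python
import numpy as np

def fit_ar(series, lags=3):
    """Fit y_t = c + a_1*y_{t-1} + ... + a_lags*y_{t-lags} by least squares."""
    n = len(series)
    X = np.ones((n - lags, lags + 1))       # column 0 is the intercept
    for i in range(1, lags + 1):
        X[:, i] = series[lags - i : n - i]  # lag-i values aligned with targets
    y = series[lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def forecast_next(series, coef):
    """One-step-ahead forecast from the most recent `lags` observations."""
    lags = len(coef) - 1
    x = np.concatenate(([1.0], series[-1 : -lags - 1 : -1]))
    return float(x @ coef)
```

More capable predictors (recurrent networks, gradient boosting, etc.) slot into the same interface: fit on the historical window, then forecast the next metric value.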
XPySom : High-Performance Self-Organizing Maps
In this paper, we introduce XPySom, a new open-source Python implementation of the well-known Self-Organizing Maps (SOM) technique. It is designed to achieve high performance on a single node, exploiting widely available Python libraries for vector processing on multi-core CPUs and GP-GPUs. We present results from an extensive experimental evaluation of XPySom in comparison to widely used open-source SOM implementations, showing that it outperforms the other available alternatives. Indeed, our experimentation carried out using the Extended MNIST open data set shows speed-ups of about 7x and 100x, with multi-core and GP-GPU acceleration respectively, when compared to the best open-source multi-core implementations we could find, achieving the same accuracy levels in terms of quantization error.
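XPySom's own API is not shown here; the kind of batched NumPy computation that such single-node implementations lean on for speed can be illustrated with a vectorized quantization-error routine, which scores a whole batch of samples against every SOM unit in one pass instead of looping sample by sample:

```python
import numpy as np

def quantization_error(weights, samples):
    """Mean distance of each sample to its best-matching unit, computed
    with a single broadcasted distance matrix of shape (samples, units).
    `weights` has shape (rows, cols, dim); `samples` has shape (n, dim)."""
    flat = weights.reshape(-1, weights.shape[-1])              # (units, dim)
    d = np.linalg.norm(samples[:, None, :] - flat[None, :, :], axis=2)
    return d.min(axis=1).mean()
```

The same broadcasting pattern runs unchanged on a GPU when NumPy is swapped for a CuPy-like array library, which is the usual route to GP-GPU speed-ups for SOM workloads.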
Predictive auto-scaling with OpenStack Monasca
Cloud auto-scaling mechanisms are typically based on reactive
automation rules that scale a cluster whenever some metric,
e.g., the average CPU usage among instances, exceeds a
predefined threshold. Tuning these rules becomes particularly
cumbersome when scaling up a cluster involves non-negligible
times to bootstrap new instances, as happens frequently
in production cloud services.
To deal with this problem, we propose an architecture for
auto-scaling cloud services based on the status in which the
system is expected to evolve in the near future. Our approach
leverages time-series forecasting techniques, like those
based on machine learning and artificial neural networks, to
predict the future dynamics of key metrics, e.g., resource
consumption metrics, and apply a threshold-based scaling
policy on them. The result is a predictive automation policy
that is able, for instance, to automatically anticipate peaks
in the load of a cloud application and trigger ahead of time
appropriate scaling actions to accommodate the expected
increase in traffic.
We prototyped our approach as an open-source OpenStack
component, which relies on, and extends, the monitoring
capabilities offered by Monasca, resulting in the addition
of predictive metrics that can be leveraged by orchestration
components like Heat or Senlin. We show experimental
results using a recurrent neural network and a multi-layer
perceptron as predictors, which are compared with a simple
linear regression and a traditional non-predictive auto-scaling
policy. Moreover, the proposed framework allows for the easy
customization of the prediction policy as needed.
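As a minimal illustration of the predictive policy described above, with a simple linear extrapolator standing in for the neural predictors (function names and the threshold rule below are illustrative, not the Monasca component's actual API):

```python
import numpy as np

def predict_linear(window, horizon):
    """Least-squares linear extrapolation of a recent metric window,
    `horizon` steps ahead of the last observation."""
    t = np.arange(len(window))
    slope, intercept = np.polyfit(t, window, 1)
    return slope * (len(window) - 1 + horizon) + intercept

def scaling_decision(window, threshold, horizon):
    """Threshold rule applied to the *predicted* metric value, so a
    scale-up can be triggered before the load peak materializes."""
    if predict_linear(window, horizon) > threshold:
        return "scale_up"
    return "hold"
```

With a steadily rising CPU-usage window, the predicted value crosses the threshold several steps before the raw metric does, which is exactly the head start needed when bootstrapping new instances is slow. Swapping `predict_linear` for an RNN or MLP predictor changes only the forecasting step, not the policy.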